Expected distance between terminal nucleotides of RNA secondary structures.
Authors
Abstract
In "The ends of a large RNA molecule are necessarily close", Yoffe et al. (Nucleic Acids Res 39(1):292-299, 2011) used the programs RNAfold [resp. RNAsubopt] from the Vienna RNA Package to calculate the distance between the 5' and 3' ends of the minimum free energy secondary structure [resp. thermal equilibrium structures] of viral and random RNA sequences. Here, the 5'-3' distance is defined to be the length of the shortest path from the 5' node to the 3' node in the undirected graph whose edge set consists of edges {i, i + 1} corresponding to covalent backbone bonds and edges {i, j} corresponding to canonical base pairs. From repeated simulations and a heuristic theoretical argument, Yoffe et al. conclude that the 5'-3' distance is less than a fixed constant, independent of RNA sequence length. In this paper, we provide a rigorous mathematical framework to study the expected distance from the 5' to the 3' end of an RNA sequence. We present recurrence relations that precisely define this expected distance, both for the Turner nearest neighbor energy model and for a simple homopolymer model first defined by Stein and Waterman. We implement dynamic programming algorithms to compute (rather than approximate by repeated application of the Vienna RNA Package) the expected distance between the 5' and 3' ends of a given RNA sequence with respect to the Turner energy model. Using methods of analytic combinatorics, which depend on complex analysis, we prove that the asymptotic expected 5'-3' distance of length-n homopolymers is approximately equal to the constant 5.47211, while the asymptotic distance is 6.771096 if hairpins have a minimum of 3 unpaired bases and the probability that any two positions can form a base pair is 1/4. Finally, we analyze the 5'-3' distance for secondary structures from the STRAND database, and conclude that the 5'-3' distance is correlated with RNA sequence length.
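The graph definition of the 5'-3' distance given in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation, and it does not compute the expected distance over an ensemble; it only evaluates the distance for one given secondary structure by breadth-first search. It assumes the structure is supplied in dot-bracket notation (an input convention chosen here for illustration), and the function names `parse_pairs` and `end_to_end_distance` are hypothetical.

```python
from collections import deque


def parse_pairs(dot_bracket):
    """Map each paired position to its partner (0-based indices).

    Assumes plain dot-bracket notation: '(' and ')' mark base pairs and
    any other character is treated as an unpaired base.
    """
    stack, partner = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return partner


def end_to_end_distance(dot_bracket):
    """Shortest-path length from the 5' node to the 3' node.

    Edges are backbone bonds {i, i + 1} and base pairs {i, j}, each of
    length 1, as in the definition quoted in the abstract.
    """
    n = len(dot_bracket)
    if n == 0:
        raise ValueError("empty structure")
    partner = parse_pairs(dot_bracket)
    dist = [-1] * n          # -1 marks a node not yet reached
    dist[0] = 0
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == n - 1:       # reached the 3' end
            return dist[i]
        neighbors = [i - 1, i + 1]
        if i in partner:
            neighbors.append(partner[i])
        for j in neighbors:
            if 0 <= j < n and dist[j] == -1:
                dist[j] = dist[i] + 1
                queue.append(j)
    return dist[n - 1]


if __name__ == "__main__":
    for s in [".....", "(...)", ".(...).", "((...))..((...))"]:
        print(s, end_to_end_distance(s))
```

For an unfolded chain the distance is simply n - 1, whereas a single base pair closing the two ends reduces it to 1. Averaging this quantity over an ensemble of structures, weighted by probability, gives the expected distance that the abstract addresses with recurrence relations.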
Journal title:
- Journal of Mathematical Biology
Volume 65, Issue 3
Pages -
Publication date 2012